Return-Path: <news@columbia.edu>
Received: from newsmaster.cc.columbia.edu (newsmaster.cc.columbia.edu [128.59.59.30])
    by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id TAA24210
    for <kermit.misc@watsun.cc.columbia.edu>; Mon, 17 Jan 2000 19:25:56 -0500 (EST)
Received: (from news@localhost)
    by newsmaster.cc.columbia.edu (8.8.5/8.8.5) id TAA28343
    for kermit.misc@watsun.cc.columbia.edu; Mon, 17 Jan 2000 19:06:58 -0500 (EST)
X-Authentication-Warning: newsmaster.cc.columbia.edu: news set sender to <news> using -f
From: fdc@watsun.cc.columbia.edu (Frank da Cruz)
Subject: Case Study #10: Atomic File Movement
Date: 18 Jan 2000 00:06:57 GMT
Organization: Columbia University
Message-ID: <860ar1$rlj$1@newsmaster.cc.columbia.edu>
To: kermit.misc@columbia.edu

Today let's look at the common situation in which files must be moved from one computer to another for processing on a regular basis. For example, daily business receipts are sent from a branch office or franchise to company headquarters, or medical or pharmaceutical insurance claims from a doctor's office, hospital, or pharmacy to a claims clearinghouse. Each file contains a series of financial transactions, so we need to ensure that each transaction occurs once and only once, and when it occurs, it occurs completely and correctly. Of course other applications can be imagined too.

Let's call the two parties "Branch" and "Headquarters" (HQ). In a typical scenario, Branch collects files (e.g. from each operator station) into a directory and then transmits them every evening to HQ. The connection can be made by traditional (non-PPP) dialup or by network. Of course Kermit is equally suited to both. (That's a strong point of Kermit, remember? For example, if you normally use a network connection but the net is broken, you can fall back on old-fashioned dialup using the same script, if the script is well designed.) The procedures for making the connection are well documented in the Kermit manuals.

Let's assume we have a connection already, we have already authenticated or logged in, and there is a Kermit server on the far end. Let's also assume that our current directory on the local computer contains the files we need to send, and there are many of them.

Of course we can just tell the local Kermit to "SEND *.*" or whatever, but what happens if the connection breaks and we have to start again? We don't want HQ to receive multiple copies of the same transaction. (Obviously there should be other safeguards but we won't discuss them here.)

There are several approaches to this problem, but the best one is Kermit's new "atomic file movement" feature. In this case "atomic" is used in the computer-science sense, not the physics one :-) The command is simple:

  SEND /DELETE *.*

This means, send all the files whose names match "*.*" (or any other pattern or filename) and delete each one as soon as, and only if, it was sent successfully (MOVE is a synonym for SEND /DELETE). Alternatively, you can use:

  SEND /MOVE-TO:xxxx *.*

which, instead of deleting each successfully sent file, moves it to the directory named xxxx. (A third choice, SEND /RENAME-TO:, is described in the update notes.)

Now if the connection is lost, you can make a new connection and give the same SEND /DELETE or SEND /MOVE-TO command again, and it sends only the files that were not already sent successfully, because the ones that were are gone.

Meanwhile, back at Headquarters we encounter the classic conundrum: how to know when a file has been completely uploaded? Let's suppose some process at HQ (besides Kermit) waits for new files to appear in the upload directory. Well, each file "appears" as soon as it is opened, but it might be open for some time while the Kermit receiver is writing new material to it (the same is true, of course, for FTP). We don't want to start processing it until it has arrived completely, but we also don't want to wait forever. Here again, atomic file movement is the answer.

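
To make the sending side's retry behavior concrete, here is a minimal Python sketch of the "delete only after a successful send" idea. It is not Kermit's implementation. The names send_and_delete and transmit are hypothetical, and the transfer itself is stubbed out by a callback that reports success or failure.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def send_and_delete(directory: Path, transmit: Callable[[Path], bool]) -> List[str]:
    """Send each file in `directory`; delete it only if `transmit` reports success.

    Stops at the first failure (treated here as a lost connection) and
    returns the names of the files that are still waiting to be sent.
    `transmit` is a hypothetical stand-in for the real file transfer.
    """
    for path in sorted(p for p in directory.iterdir() if p.is_file()):
        if not transmit(path):
            break        # connection lost: leave this file and the rest in place
        path.unlink()    # delete only after a confirmed successful transfer
    return sorted(p.name for p in directory.iterdir() if p.is_file())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        outbox = Path(tmp)
        for name in ("a.txt", "b.txt", "c.txt"):
            (outbox / name).write_text("receipts\n")

        delivered: List[str] = []
        calls = {"n": 0}

        def flaky(path: Path) -> bool:
            calls["n"] += 1
            if calls["n"] == 2:   # simulate the line dropping on the second file
                return False
            delivered.append(path.name)
            return True

        print(send_and_delete(outbox, flaky))   # ['b.txt', 'c.txt']
        print(send_and_delete(outbox, flaky))   # []
        print(delivered)                        # ['a.txt', 'b.txt', 'c.txt']
```

The second call is simply the same operation repeated after the simulated reconnect. Because successfully sent files are already gone, each file is delivered exactly once.
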
If the Kermit server at HQ is given the command:

  SET RECEIVE MOVE-TO xxxx

(where xxxx is the name of a directory), this tells it to move each received file to the specified directory after, and only if, it is received successfully. So the script to start up the server at HQ might look like this:

  cd /incoming/tmp/
  set receive move-to /incoming/ready/
  server
  exit

The underlying API is chosen to be atomic; for example the UNIX rename() system call is used (or link() when rename() is not available). The instant the file appears in the /incoming/ready/ directory, it's ready to use and not in the middle of being copied. And it won't come back to haunt you again after processing, because the Branch won't upload it again.

As for making sure the files get through despite repeated disconnections, see the 'deliver' script on page 453 of "Using C-Kermit" or in the C-Kermit script library:

  ftp://kermit.columbia.edu/kermit/scripts/ckermit/deliver

For details about atomic file movement, see Sections 4.0.8, 4.1.3, and 4.7 of the ckermit2.txt file.

- Frank
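
As an illustration of the receive-side idea above (write the file in one directory, then rename it into the directory the processing job watches), here is a minimal Python sketch. It is not Kermit's code. The function receive_file and its parameters are hypothetical, and it assumes both directories are on the same filesystem, where a POSIX rename is atomic.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable


def receive_file(name: str, chunks: Iterable[bytes],
                 tmp_dir: Path, ready_dir: Path) -> Path:
    """Write an incoming file under tmp_dir, then rename it into ready_dir.

    The file becomes visible in ready_dir only after every chunk has been
    written, so a watcher there never sees a partial upload.
    """
    partial = tmp_dir / name
    with open(partial, "wb") as f:
        for chunk in chunks:          # may raise if the connection drops
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())          # push the data to disk before publishing
    final = ready_dir / name
    os.rename(partial, final)         # assumed atomic: same filesystem, POSIX
    return final


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        tmp_dir, ready_dir = Path(root) / "tmp", Path(root) / "ready"
        tmp_dir.mkdir()
        ready_dir.mkdir()

        def broken_upload():
            yield b"first part"
            raise ConnectionError("line dropped")

        receive_file("good.dat", [b"first part", b" second part"], tmp_dir, ready_dir)
        try:
            receive_file("bad.dat", broken_upload(), tmp_dir, ready_dir)
        except ConnectionError:
            pass

        print(sorted(p.name for p in ready_dir.iterdir()))   # ['good.dat']
```

In this sketch the interrupted upload leaves its partial file behind in the temporary directory. Whether to keep or discard such leftovers is a separate policy decision that the sketch does not address.
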